Pre-deployed models
LiveHub provides several pre-deployed Large Language Models (LLMs) that can be used in AI Agents.
The following models are currently provided:
- OpenAI models:
  - gpt-4o
  - gpt-4o-mini
  - gpt-4.1
  - gpt-4.1-mini
  - gpt-4.1-nano
  - gpt-5
  - gpt-5-mini
  - gpt-5-nano
- Google models:
  - gemini-2.0-flash
  - gemini-2.0-flash-lite
  - gemini-2.5-flash
  - gemini-2.5-flash-lite
  - gemini-3-flash
Each model has its own strengths and weaknesses: it may excel at some tasks and underperform at others. Models also differ in speed and operational cost, depending on their architecture and optimization.
Choosing between the models
For typical voice agent use cases, we recommend starting with one of the following models:
- gpt-4o-mini
- gpt-4.1-mini
- gemini-2.5-flash
These models offer a good balance of capabilities, latency, and cost efficiency.
Experiment with different models, as a model’s performance is highly dependent on your prompts.
If your agent struggles to follow complex instructions, consider switching to a larger model, for example, gpt-4o or gpt-4.1.
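The escalation guidance above can be sketched as a small helper. The model pairings come from the lists in this section; the helper function itself is illustrative and is not part of any LiveHub API:

```python
# Starter models recommended above for typical voice agent use cases.
RECOMMENDED_STARTERS = ["gpt-4o-mini", "gpt-4.1-mini", "gemini-2.5-flash"]

# Larger siblings from the pre-deployed list, to try when an agent
# struggles with complex instructions. (Illustrative pairing, not an
# official mapping.)
LARGER_FALLBACK = {
    "gpt-4o-mini": "gpt-4o",
    "gpt-4.1-mini": "gpt-4.1",
}

def next_model(current: str, struggles_with_instructions: bool) -> str:
    """Keep the current model, or step up to a larger sibling if one exists."""
    if struggles_with_instructions:
        return LARGER_FALLBACK.get(current, current)
    return current
```

For example, `next_model("gpt-4o-mini", True)` suggests stepping up to `gpt-4o`, while a model that is performing well is left unchanged.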
Real-time models
Real-time (or speech-to-speech) models communicate with the user directly through audio and completely bypass speech-to-text and text-to-speech services.
The following real-time models are pre-deployed in LiveHub:
- gpt-realtime
- gpt-realtime-mini
- gemini-2.5-flash-native-audio
Real-time models are indicated by a mark in the model selector.
Make sure to enable voice streaming in the Speech and Telephony tab when working with real-time models.
For additional information, see Real-time models.